DAOS-17045 bio: drain inflight health collecting on teardown #15884
When tearing down a faulty device, ensure the in-flight health-collecting NVMe command has completed before putting the I/O channel and open descriptor for the health monitor. Signed-off-by: Niu Yawei <[email protected]>
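The ordering described above can be illustrated with a minimal sketch. It is not the actual DAOS or SPDK code: `struct health_monitor`, `submit_health_cmd`, `poll_completions`, and `monitor_teardown` are hypothetical names, and the I/O channel and open descriptor are modeled as plain flags. It only assumes that a completion callback still needs both resources, so teardown has to drain the in-flight count before releasing them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the health monitor state; not the DAOS struct. */
struct health_monitor {
    int  inflight;      /* health commands submitted but not yet completed */
    bool channel_held;  /* models the I/O channel reference */
    bool desc_open;     /* models the open descriptor */
};

/* Simulated submission of one health-collecting command. */
static void submit_health_cmd(struct health_monitor *hm)
{
    hm->inflight++;
}

/* Simulated completion callback: it relies on the channel and descriptor,
 * so running it after they were released would be the bug being avoided. */
static void health_cmd_done(struct health_monitor *hm)
{
    assert(hm->channel_held && hm->desc_open);
    hm->inflight--;
}

/* Simulated poller: completes at most one pending command per call. */
static void poll_completions(struct health_monitor *hm)
{
    if (hm->inflight > 0)
        health_cmd_done(hm);
}

/* Teardown: drain in-flight commands first, then release the resources.
 * Returns the number of poll iterations that were needed. */
static int monitor_teardown(struct health_monitor *hm)
{
    int polls = 0;

    while (hm->inflight > 0) {
        poll_completions(hm);
        polls++;
    }
    hm->channel_held = false;
    hm->desc_open = false;
    return polls;
}
```

Releasing the flags before the loop would trip the assertion in `health_cmd_done`, which mirrors, in simplified form, a completion arriving after its channel is gone.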
Ticket title is 'server segfault on aurora with spdk'
Do we need to port this to 2.6 as well?
There are no unit tests for this area of code. Is it considered safely covered by functional tests?
Yes, I'll backport it to 2.6.4 once it has landed on master.
Yes, this defect can't be detected by a unit test, since it's a race that happens in server mode. I think we may already have a functional test that marks a device as faulty, but that won't necessarily expose this race.
Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-15884/2/execution/node/1420/log
The test failure is DAOS-16737, which is not related to this PR. I think we should force-land this.
…#15922) When tearing down a faulty device, ensure the in-flight health-collecting NVMe command has completed before putting the I/O channel and open descriptor for the health monitor. Signed-off-by: Niu Yawei <[email protected]>